Approximation algorithms for NMR spectral peak assignment
Abstract
We study a constrained bipartite matching problem in which the input is a weighted bipartite graph G = (U, V, E): U is a set of vertices following a sequential order, V is another set of vertices partitioned into a collection of disjoint subsets, each following a sequential order, and E is a set of edges between U and V with non-negative weights. The objective is to find a maximum-weight matching in G that satisfies the given sequential orders on both U and V, i.e., if u_{i+1} follows u_i in U and v_{j+1} follows v_j in V, then u_i is matched with v_j if and only if u_{i+1} is matched with v_{j+1}. The problem has recently been formulated as a crucial step in an algorithmic approach for interpreting NMR spectral data [16]. The interpretation of NMR spectral data is a key problem in protein structure determination via NMR spectroscopy. Unfortunately, the constrained bipartite matching problem is NP-hard [16]. We first propose a 2-approximation algorithm for the problem, which follows directly from the recent result of Bar-Noy et al. [2] on interval scheduling. However, our extensive experiments on real NMR spectral data show that this algorithm performs poorly at recovering target-matching edges. We then propose another approximation algorithm that tries to take advantage of the "density" of the sequential order information in V. Although we are only able to prove an approximation ratio of 3 log_2 D for this algorithm, where D is the length of a longest string (ordered subset) in V, the experimental results demonstrate that it performs much better on real data: it recovers a large fraction of target-matching edges, and the weight of its output matching is often close to the maximum. We also prove that the problem is MAX SNP-hard, even if the input bipartite graph is unweighted. We further present an approximation algorithm for a nontrivial special case that breaks the ratio-2 barrier.
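To make the constraint in the abstract concrete, the following is a minimal sketch of an exhaustive search for tiny instances. It is not one of the approximation algorithms from the paper. It assumes that U is indexed 0..n-1 in its sequential order, that each ordered subset ("string") of V is given as a list of vertex names, and that a string is either matched as a whole to consecutive vertices of U or left entirely unmatched, which is one reading of the "if and only if" condition. The names `placement_weight` and `best_constrained_matching` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Edge weights keyed by (position in U, vertex name in V); a missing key means no edge.
Weight = Dict[Tuple[int, str], float]


def placement_weight(string: List[str], start: int, n: int,
                     weight: Weight) -> Optional[float]:
    """Weight of matching `string` to U[start], U[start+1], ...; None if infeasible."""
    if start + len(string) > n:
        return None
    total = 0.0
    for offset, v in enumerate(string):
        w = weight.get((start + offset, v))
        if w is None:  # a required edge is absent, so the whole placement fails
            return None
        total += w
    return total


def best_constrained_matching(n: int, strings: List[List[str]],
                              weight: Weight) -> Tuple[float, Dict[int, int]]:
    """Exhaustive search (exponential time): returns (best weight, {string index: start})."""
    best: Tuple[float, Dict[int, int]] = (0.0, {})

    def search(k: int, used: List[bool], total: float, chosen: Dict[int, int]) -> None:
        nonlocal best
        if k == len(strings):
            if total > best[0]:
                best = (total, dict(chosen))
            return
        # Option 1: leave string k entirely unmatched.
        search(k + 1, used, total, chosen)
        # Option 2: match string k to a free block of consecutive U vertices.
        s = strings[k]
        for start in range(n - len(s) + 1):
            span = range(start, start + len(s))
            if any(used[i] for i in span):
                continue
            w = placement_weight(s, start, n, weight)
            if w is None:
                continue
            for i in span:
                used[i] = True
            chosen[k] = start
            search(k + 1, used, total + w, chosen)
            del chosen[k]
            for i in span:
                used[i] = False

    search(0, [False] * n, 0.0, {})
    return best


# Illustrative instance (made-up weights): |U| = 4, strings ["a", "b"] and ["c"].
example_weight: Weight = {
    (0, "a"): 3, (1, "b"): 2,   # "ab" placed at position 0
    (1, "a"): 1, (2, "b"): 1,   # "ab" placed at position 1
    (1, "c"): 4, (3, "c"): 1,   # "c" placed at position 1 or 3
}
result = best_constrained_matching(4, [["a", "b"], ["c"]], example_weight)
```

In this instance the heaviest single edge, (1, "c"), is not part of the best solution, because using it blocks both placements of the two-vertex string. This illustrates why the sequential-order constraint makes the problem harder than ordinary weighted bipartite matching.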
∗Department of Mathematical Sciences, Tokyo Denki University, Hatoyama, Saitama 350-0394, Japan. Supported in part by the Grant-in-Aid for Scientific Research of the Ministry of Education, Science, Sports and Culture of Japan, under Grant No. 12780241. Work done while visiting UC Riverside.
†Department of Computer Science, University of California, Riverside, CA 92521. Supported in part by a UCR startup grant and NSF Grants CCR-9988353 and ITR-0085910.
‡Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada. Supported in part by Startup grant G227120195 from the University of Alberta, NSERC Research Grants OGP0046613 and OGP0046506, and a CITO grant.
§Department of Computer Science, University of California, Riverside, CA 92521. Supported in part by NSF Grant CCR-9988353.
¶Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada. Supported in part by NSERC Research Grant OGP0046506.
‖Protein Informatics Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831-6480. Supported by the Office of Biological and Environmental Research, U.S. Department of Energy, under Contract DE-AC05-00OR22725, managed by UT-Battelle, LLC.
Related articles
Improved algorithms for 2-interval scheduling and NMR spectral peak assignment
We consider the 2-interval scheduling problem (2-ISP) defined as follows. We are given a discrete time interval I and a set J of jobs to be executed on a single machine during I. Each job v ∈ J requires either one or two contiguous time units of I and has a profit w(v, t) if v is started at time point t of I. Our goal is to maximize the total profit of the executed jobs. It has been recently sh...
RIBRA-An Error-Tolerant Algorithm for the NMR Backbone Assignment Problem
We develop an iterative relaxation algorithm called RIBRA for NMR protein backbone assignment. RIBRA applies nearest neighbor and weighted maximum independent set algorithms to solve the problem. To deal with noisy NMR spectral data, RIBRA is executed in an iterative fashion based on the quality of spectral peaks. We first produce spin system pairs using the spectral data without missing peaks,...
More Reliable Protein NMR Peak Assignment via Improved 2-Interval Scheduling
Protein NMR peak assignment refers to the process of assigning a group of "spin systems" obtained experimentally to a protein sequence of amino acids. The automation of this process is still an unsolved and challenging problem in NMR protein structure determination. Recently, protein NMR peak assignment has been formulated as an interval scheduling problem (ISP), where a protein sequence P of a...
Automated backbone assignment of labeled proteins using the threshold accepting algorithm.
The sequential assignment of backbone resonances is the first step in the structure determination of proteins by heteronuclear NMR. For larger proteins, an assignment strategy based on proton side-chain information is no longer suitable for the use in an automated procedure. Our program PASTA (Protein ASsignment by Threshold Accepting) is therefore designed to partially or fully automate the se...
NvAssign: protein NMR spectral assignment with NMRView
MOTIVATION Nuclear magnetic resonance (NMR) protein studies rely on the accurate assignment of resonances. The general procedure is to (1) pick peaks, (2) cluster data from various experiments or spectra, (3) assign peaks to the sequence and (4) verify the assignments with the spectra. Many algorithms already exist for automating the assignment process (step 3). What is lacking is a flexible in...
High resolution 4D HPCH experiment for sequential assignment of 13C-labeled RNAs via phosphodiester backbone
The three-dimensional structure determination of RNAs by NMR spectroscopy requires sequential resonance assignment, often hampered by assignment ambiguities and limited dispersion of (1)H and (13)C chemical shifts, especially of C4'/H4'. Here we present a novel through-bond 4D HPCH NMR experiment involving phosphate backbone where C4'-H4' correlations are resolved along the (1)H3'-(31)P spectra...
Journal: Theor. Comput. Sci.
Volume: 299
Year of publication: 2003